Word Meanings across Languages Support Efficient Communication
Abstract
Why do languages have the semantic categories they do? Each language partitions human experience into a system of semantic categories, labeled by words or morphemes, which are used to communicate about experience. These categories often differ widely across languages. Thus, languages do not merely provide different labels for the same universally shared set of categories; instead, both the labels and the categories themselves may be to some extent language-specific. However, this cross-language variation is constrained. Words with similar or identical meanings often appear in unrelated languages, and most logically possible meanings are unattested, which suggests that universal forces constrain cross-language diversity. Accounting for this pattern of wide but constrained variation is a central theoretical challenge in understanding why languages have the particular forms they do.
Related resources
Semantic typology and efficient communication
Cross-linguistic work on domains including kinship, color, folk biology, number, and spatial relations has documented the different ways in which languages carve up the world into named categories. Although word meanings vary widely across languages, unrelated languages often have words with similar or identical meanings, and many logically possible meanings are never observed. We review work s...
Word lengths are optimized for efficient communication.
We demonstrate a substantial improvement on one of the most celebrated empirical laws in the study of language, Zipf's 75-year-old theory that word length is primarily determined by frequency of use. In accord with rational theories of communication, we show across 10 languages that average information content is a much better predictor of word length than frequency. This indicates that human lexi...
Tense systems across languages support efficient communication
All languages have ways of expressing location in time, but they differ widely in their grammatical tense systems. At the same time, there are tense systems that recur across unrelated languages. What explains this wide but constrained variation? Taking a functionalist perspective, we propose that tense systems are shaped by the need to support efficient communication, a need that has recently b...
MultiLexExplorer: Combining Multilingual Web Search with Multilingual Lexical Resources
Multilingual lexical resources provide information about the linguistic relations between words across languages. On the other hand, huge document collections such as the World Wide Web provide statistical information about the distribution and co-occurrence of words in almost all languages. In this paper, we present a tool designed to combine the information from both resources in order to sup...
Corpus-based Techniques for Word Sense Disambiguation
Consider the task of building a speech-to-speech translation system. One significant problem confronting the designer is the absence of a one-to-one mapping from word sounds to text strings to word meanings. The following examples reveal the ubiquity of this problem. In a highly homophonous language like Chinese, the single sound sequence 'shi' maps to 56 different characters, each of which in tu...